Abstract

A novel lattice coding framework is proposed for outage-limited cooperative channels. This framework
provides practical implementations for the optimal cooperation protocols proposed by Azarian et al.
In particular, for the relay channel we implement a variant of the dynamic decode and forward
protocol, which uses orthogonal constellations to reduce the channel seen by the destination to a
single-input single-output time-selective one, while inheriting the same diversity-multiplexing tradeoff.
This simplification allows for building the receiver using traditional belief propagation or tree search
architectures. Our framework also generalizes the coding scheme of Yang and Belfiore in the context of
amplify and forward cooperation. For the cooperative multiple access channel, a tree coding approach,
matched to the optimal linear cooperation protocol of Azarian et al., is developed. For this scenario,
the MMSE-DFE Fano decoder is shown to enjoy an excellent tradeoff between performance and
complexity. Finally, the utility of the proposed schemes is established via a comprehensive simulation
study.

Introduction 

Lately, cooperative communications has been the focus of intense research activities. As a consequence, 
we now have a wealth of results covering a wide range of channel models and/or system design as- 
pects [1-10]. In this paper we focus on outage-limited cooperative channels, where a slow fading model 
is assumed for both relay and cooperative multiple access (CMA) scenarios. We further impose the half 
duplex constraint limiting each node to either transmit or receive at any point in time. The primary goal in 
this setting is to construct strategies that exploit the available cooperative diversity with a reasonable en- 
coding/decoding complexity. Our work is inspired by the cooperation protocols of Azarian et al. in [1]. In 
particular, for the relay channel we implement a novel variant of the dynamic decode and forward (DDF) 
strategy that, through judicious use of orthogonal space-time constellations, reduces the channel seen by
the destination to a single-input single-output (SISO) time-selective channel. This variant achieves the 
excellent diversity-multiplexing tradeoff (DMT) of the DDF protocol [1], while minimizing the decoding 
complexity at both the relay and the destination nodes. We then present a tree coding/decoding implemen- 
tation of the optimal cooperative multiple access (CMA) strategy proposed in [1]. This implementation 
employs the minimum mean square error decision feedback equalizer (MMSE-DFE) Fano decoder [11], 
to approximate the maximum-likelihood (ML) performance at a much lower complexity. In summary, 
our contributions in this paper are twofold. First, we establish the practical value of the information- 
theoretically optimal cooperation protocols proposed in [1]. This goal is accomplished by constructing low 
complexity lattice coding/decoding implementations of the DDF and CMA cooperation strategies, which 
are shown to significantly outperform the recently proposed codes by Yang and Belfiore [10]. Second, we 
elucidate the performance-complexity tradeoff and the parameters controlling it, in different cooperation 
settings. 

The rest of the paper is organized as follows. Section 2 introduces our system model and notation. In
Section 3 we briefly review the lattice coding/decoding framework adopted in our work. The proposed
coding/decoding schemes for the outage-limited relay channel are detailed in Section 4. Section 5 is
devoted to the description of our tree coding/decoding approach for the CMA channel. Finally, we offer
some concluding remarks in Section 6.

2 System Model and Notation 

In this section, we state the assumptions that apply to the two scenarios considered in this paper (i.e., 
relay and CMA channels). Assumptions pertaining to a specific channel will be given in the related 
section. All channels are assumed to be flat Rayleigh-fading and quasi-static, i.e., the channel gains remain 
constant during one codeword and change independently from one codeword to the next. Furthermore, 
the channel gains are mutually independent with unit variance. The additive noises at different nodes are 
zero-mean, mutually-independent, circularly-symmetric, and white complex-Gaussian. The variances of 
these noises are proportional to one another such that there are always fixed offsets between the different
signal-to-noise ratios (SNRs). All nodes have the same power constraint, have a single antenna, and 
operate synchronously. Only the receiving node of any link knows the channel gain; no feedback to the 
transmitting node is permitted. All cooperating partners operate in the half-duplex mode, i.e., at any point 
in time, a node can either transmit or receive, but not both. This constraint is motivated by, e.g., the
typically large difference between the incoming and outgoing signal power levels. Next we summarize the 
notation used throughout the paper. 

1. The SNR of a link, $\rho$, is defined as

$$\rho \triangleq \frac{E}{\sigma^2}, \tag{1}$$

where $E$ denotes the average energy available for transmission of a symbol across the link and $\sigma^2$
denotes the variance of the noise observed at the receiving end of the link. We say that $f(\rho)$ is
exponentially equal to $\rho^b$, denoted by $f(\rho) \doteq \rho^b$, when

$$\lim_{\rho \to \infty} \frac{\log f(\rho)}{\log \rho} = b. \tag{2}$$

In (2), $b$ is called the exponential order of $f(\rho)$. $\dot{\le}$ and $\dot{\ge}$ are defined similarly.

2. Assuming that $g$ is a complex Gaussian random variable with zero mean and unit variance, the
exponential order of $1/|g|^2$ is defined as

$$v = -\lim_{\rho \to \infty} \frac{\log(|g|^2)}{\log \rho}. \tag{3}$$

The probability density function (PDF) of $v$ has the following property [12]

$$p_v \doteq \begin{cases} \rho^{-\infty} = 0, & \text{for } v < 0, \\ \rho^{-v}, & \text{for } v \ge 0. \end{cases} \tag{4}$$

3. Consider a family of codes $\{C_\rho\}$ indexed by operating SNR $\rho$, such that the code $C_\rho$ has a rate of
$R(\rho)$ bits per channel use (BPCU) and ML error probability $P_e(\rho)$. For this family, the multiplexing
gain $r$ and the diversity gain $d$ are defined as

$$r \triangleq \lim_{\rho \to \infty} \frac{R(\rho)}{\log \rho}, \qquad d \triangleq -\lim_{\rho \to \infty} \frac{\log(P_e(\rho))}{\log \rho}. \tag{5}$$

4. The problem of characterizing the optimal DMT in a point-to-point communication system over a
coherent quasi-static flat Rayleigh-fading channel was posed and solved by Zheng and Tse in [13].
For a MIMO communication system with $M$ transmit and $N$ receive antennas, they showed that,
for any $r \le \min\{M, N\}$, the optimal diversity gain $d^*(r)$ is given by the piecewise linear function
joining the $(r, d)$ pairs $(k, (M - k)(N - k))$ for $k = 0, \ldots, \min\{M, N\}$, provided that the code-length
$l$ satisfies $l \ge M + N - 1$.



5. We say that protocol A uniformly dominates protocol B if, for any multiplexing gain $r$,
$d_A(r) \ge d_B(r)$.

6. We say that protocol A is Pareto optimal, if there is no protocol B that dominates protocol A in the
Pareto sense. Protocol B is said to dominate protocol A in the Pareto sense if there is some $r$ for
which $d_B(r) > d_A(r)$, but no $r$ such that $d_B(r) < d_A(r)$.

7. We denote the ratio of the destination noise variance to the inter-user noise variance by $c$, i.e.,
$c = \sigma_v^2 / \sigma_w^2$.

8. Throughout the sequel, vectors are denoted by bold lowercase characters (e.g., $\mathbf{x}$), and matrices
are denoted by bold uppercase characters (e.g., $\mathbf{H}$). $\mathbb{Z}$, $\mathbb{R}$, $\mathbb{C}$ refer to the ring of integers, field
of real numbers, and field of complex numbers, respectively. We refer to the identity matrix of
dimension $M$ as $\mathbf{I}_M$ and the Kronecker product by $\otimes$. We also use $(x)^+$ to mean $\max\{x, 0\}$, $(x)^-$ to
mean $\min\{x, 0\}$ and $\lceil x \rceil$ to mean the nearest integer to $x$ towards plus infinity.
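As an illustration of item 4 above, the piecewise-linear tradeoff curve of Zheng and Tse can be evaluated numerically. The following is a minimal sketch; the function name `optimal_dmt` and its argument names are ours and do not appear in [13].

```python
def optimal_dmt(r, M, N):
    """Optimal diversity gain d*(r) of an M x N MIMO channel.

    Linearly interpolates between the corner points (k, (M-k)(N-k)),
    k = 0, ..., min(M, N). The function name is illustrative only.
    """
    k_max = min(M, N)
    if not 0 <= r <= k_max:
        raise ValueError("r must lie in [0, min(M, N)]")
    # Index of the linear segment containing r (r == k_max uses the last one).
    k = min(int(r), k_max - 1)
    d_left = (M - k) * (N - k)
    d_right = (M - k - 1) * (N - k - 1)
    return d_left + (r - k) * (d_right - d_left)
```

For example, `optimal_dmt(0.5, 2, 2)` lies halfway between the corner points (0, 4) and (1, 1).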



3 Lattice Coding and Decoding 



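An $m$-dimensional lattice $\Lambda \subset \mathbb{R}^m$ is the set of points

$$\Lambda = \{\lambda = \mathbf{G}\mathbf{u} : \mathbf{u} \in \mathbb{Z}^m\} \tag{6}$$

where $\mathbf{G} \in \mathbb{R}^{m \times m}$ is the lattice generator matrix. Let $\eta \in \mathbb{R}^m$ be a vector and $\mathcal{R}$ a measurable region in
$\mathbb{R}^m$; then a lattice code $\mathcal{C}(\Lambda, \eta, \mathcal{R})$ is defined as the set of points of the lattice translate $\Lambda + \eta$ inside the
shaping region $\mathcal{R}$ [14], i.e.,

$$\mathcal{C}(\Lambda, \eta, \mathcal{R}) = \{\Lambda + \eta\} \cap \mathcal{R}. \tag{7}$$

Here, we focus on construction "A" lattice codes [15]. In this construction, $\Lambda = C + Q\mathbb{Z}^m$ with $C \subseteq \mathbb{Z}_Q^m$
being a linear code over $\mathbb{Z}_Q$ and $Q$ a prime. The generator matrix of $\Lambda$ is given by

$$\mathbf{G} = \begin{bmatrix} \mathbf{I} & \mathbf{0} \\ \mathbf{P} & Q\mathbf{I} \end{bmatrix} \tag{8}$$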



where $[\mathbf{I}, \mathbf{P}^T]^T$ is the generator matrix of $C$ (in a systematic form) [15]. The shaping region $\mathcal{R}$ of the
lattice code $\mathcal{C}(\Lambda, \eta, \mathcal{R})$ can be 1) the $m$-dimensional sphere, 2) the fundamental Voronoi region of a
sublattice $\Lambda' \subset \Lambda$, or 3) the $m$-dimensional hypercube. These three alternatives provide a tradeoff between
performance and encoding complexity. The spherical shaping approach yields the maximum shaping gain
but suffers from the largest encoding complexity. Encoding is typically done via look-up tables which 
limits this approach to short block lengths (since the number of entries in the table grows exponentially 
with the block length). Voronoi shaping offers a nice compromise between performance and complexity. 
In this approach, the encoding complexity is equivalent to that of m-dimensional vector quantization 
which, loosely speaking, ranges from polynomial to linear (in the block length) based on the choice of 
the shaping lattice. On the other hand, with large enough $m$ and the appropriate choice of $\Lambda'$, the shaping
gain approaches that of the sphere. The third alternative, i.e., hypercubic shaping, allows for the lowest
encoding complexity, however, at the expense of a performance loss (i.e., the worst shaping gain). As a side
benefit, hypercubic shaping also minimizes the peak-to-average power ratio. The final ingredient in lattice
coding is the translate vector $\eta$, which is used to maximize the number of lattice points inside $\mathcal{R}$ and/or
randomize the distribution of the codebook over $\mathcal{R}$ [14, 16]. In fact, Voronoi coding with $\eta$ uniformly
distributed over the Voronoi cell of $\Lambda'$ corresponds to the mod-$\Lambda$ approach of Erez and Zamir [16]. The
proposed coding approaches in our work can be coupled with any shaping technique and any choice for
the translate $\eta$. The optimization of these parameters is beyond the scope of this paper, and hence, will not
be considered further. For simplicity of presentation, and implementation, we will focus in our simulation 
study on hypercubic shaping. 
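To make the construction "A" generator matrix in (8) concrete, the following minimal sketch builds $\mathbf{G}$ from a parity block $\mathbf{P}$ and a prime $Q$, and maps an integer vector $\mathbf{u}$ to a lattice point. The helper names and the toy parity block in the usage note are illustrative assumptions, not codes used in our simulations.

```python
def construction_a_generator(P, Q):
    """Return G = [[I, 0], [P, Q*I]] as a list of rows.

    P is the (m-k) x k parity block of a systematic linear code over Z_Q,
    given as a list of integer rows. Helper name is illustrative only.
    """
    n_parity, k = len(P), len(P[0])
    m = k + n_parity
    G = [[0] * m for _ in range(m)]
    for i in range(k):
        G[i][i] = 1                      # identity block
    for i in range(n_parity):
        for j in range(k):
            G[k + i][j] = P[i][j] % Q    # parity block
        G[k + i][k + i] = Q              # Q * identity block
    return G


def lattice_point(G, u):
    """Compute the lattice point G u for an integer vector u."""
    return [sum(g * x for g, x in zip(row, u)) for row in G]
```

With the toy choice `P = [[2], [3]]` and `Q = 5`, every lattice point reduces modulo 5 to a multiple of the codeword (1, 2, 3), as expected from $\Lambda = C + Q\mathbb{Z}^m$.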

The primary appeal of lattice codes, in our framework, stems from their amenability to a low complexity
decoding architecture, as argued in the sequel. For decoding purposes, we express $\mathcal{C}(\Lambda, \eta, \mathcal{R})$ as the
set of points $\mathbf{x}$ given by

$$\mathbf{x} = \mathbf{G}\mathbf{u} + \eta, \quad \text{for } \mathbf{u} \in U \tag{9}$$

where $U \subset \mathbb{Z}^m$ is the code information set. Assuming that the code is used over a linear Gaussian channel,
then the input-output relation is given by

$$\mathbf{r} = \mathbf{H}\mathbf{x} + \mathbf{z} \tag{10}$$

where $\mathbf{r} \in \mathbb{R}^n$ denotes the received signal vector, $\mathbf{z} \sim \mathcal{N}(0, \sigma^2 \mathbf{I})$ is the AWGN vector, and $\mathbf{H} \in \mathbb{R}^{n \times m}$
is a matrix that defines the channel linear mapping. In the coherent paradigm, where $\mathbf{H}$ is known to the
receiver, the maximum likelihood (ML) decoding rule reduces to

$$\hat{\mathbf{u}} = \arg\min_{\mathbf{u} \in U} |\mathbf{r} - \mathbf{H}\eta - \mathbf{H}\mathbf{G}\mathbf{u}|^2 \tag{11}$$

The optimization problem in (11) can be viewed as a constrained version of the closest lattice point search
(CLPS) with lattice generator matrix given by $\mathbf{H}\mathbf{G}$ and constraint set $U$ [11]. This observation inspired the
class of sphere decoding algorithms (e.g., [17]). More recently, a unified tree search decoding framework
which encompasses the sphere decoders as special cases was proposed in [11]. Of particular interest to
our work here is the MMSE-DFE Fano decoder proposed in [11]. In this decoder, we first preprocess
the channel matrix via the feedforward filter of the MMSE-DFE, then we apply the celebrated Fano tree
search algorithm to identify the closest lattice point. In particular, we attempt to approximate the optimal
solution for

$$\hat{\mathbf{u}} = \arg\min_{\mathbf{u} \in \mathbb{Z}^m} |\mathbf{F}(\mathbf{r} - \mathbf{H}\eta - \mathbf{H}\mathbf{G}\mathbf{u})|^2, \tag{12}$$

where $\mathbf{F}$ is the feedforward filter of the MMSE-DFE. It is important to note the expanded search space
in (12), i.e., we replaced $U$ in (11) with $\mathbb{Z}^m$. Relaxing this constraint results in a significant complexity
reduction since enforcing the boundary control $\mathbf{u} \in U$ can be computationally intensive for non-trivial
lattice codes [11]. While the search space expansion is another source of sub-optimality, it was shown
in [11, 18] that the loss in performance is very marginal only when the MMSE-DFE preprocessing is
employed (i.e., with $\mathbf{F} = \mathbf{I}$ one would see a significant performance loss).

One of our main contributions is showing that the MMSE-DFE Fano decoder yields an excellent
performance-vs-complexity tradeoff when appropriately used in cooperative channels. For more details
about this decoder, the reader is referred to [11]. 

4 The Relay Channel 

For exposition purposes, we limit our discussion to the single relay scenario. The proposed techniques, 
however, extend naturally to channels with an arbitrary number of relays. 

4.1 Amplify and Forward (AF) Cooperation 

In [1], the non-orthogonal amplify and forward (NAF) strategy was shown to achieve the optimal DMT 
within the class of AF protocols. In NAF relaying, the source transmits on every symbol-interval in 
a cooperation frame, where a cooperation frame is defined as two consecutive symbol-intervals. The 
relay, on the other hand, transmits only once per cooperation frame; it simply repeats the (noisy) signal 
it observed during the previous symbol-interval. It is clear that this design is dictated by the half-duplex 
constraint, which implies that the relay can repeat at most once per cooperation frame. We denote the 
repetition gain by $b$ and, for frame $k$, the information symbols are denoted by $\{x_{j,k}\}_{j=1}^{2}$. The signals
received by the destination during frame $k$ are given by:

$$y_{1,k} = g_1 x_{1,k} + v_{1,k}, \tag{13}$$
$$y_{2,k} = g_1 x_{2,k} + g_2 b\,(h x_{1,k} + w_{1,k}) + v_{2,k}. \tag{14}$$

Note that, in order to decode the message, the destination needs to know the relay repetition gain $b$, the
source-relay channel gain $h$, the source-destination channel gain $g_1$, and the relay-destination channel gain
$g_2$. The following result from [1] states the DMT achieved by this protocol.



Theorem 1 The diversity-multiplexing tradeoff achieved by the NAF relay protocol is

$$d(r) = (1 - r) + (1 - 2r)^+. \tag{15}$$



In [1], the achievability of (15) was established using a long Gaussian codebook which spans infinitely
many cooperation frames (with the same channel coefficients). More recently, Yang and Belfiore have
proposed a novel scheme that achieves the tradeoff in (15) by only coding over one cooperation frame. The Yang
and Belfiore design is inspired by the fact that the input-output relationship in (13) and (14) corresponds
to the following 2x2 MIMO channel

$$\mathbf{y} = \begin{bmatrix} g_1 & 0 \\ \sqrt{\frac{c}{|g_2 b|^2 + c}}\; g_2 b h & \sqrt{\frac{c}{|g_2 b|^2 + c}}\; g_1 \end{bmatrix} \mathbf{x} + \mathbf{z} \tag{16}$$

where $\mathbf{z} \in \mathbb{C}^2$ is the noise vector with circularly symmetric i.i.d. Gaussian components, $z_i \sim \mathcal{N}_{\mathbb{C}}(0, \sigma_v^2)$.
One can then use any of the 2x2 linear dispersion (LD) constellations [19] as a cooperation scheme in this 
setup. A 2 x 2 LD constellation is obtained by multiplying a 4-dimensional QAM vector u by a generator 
matrix. As shown in [10], by setting the generator matrix to that of the so called Golden constellation, i.e., 



where $\theta = \frac{1 + \sqrt{5}}{2}$, $\bar{\theta} = 1 - \theta$, $\alpha = 1 + i\bar{\theta}$ and $\bar{\alpha} = 1 + i\theta$, one can use the non-vanishing determinant
property to establish the achievability of (15) by this scheme. The lattice decoding framework, adopted
here, can be applied to this approach by applying the appropriate scaling factor and separating the real and
imaginary parts of the received signal. This reduces (16) to our model in (10), where $U = \mathbb{Z}_Q^8$ corresponds
to an input $Q^2$-QAM constellation.

In applications where the codeword is allowed to span multiple cooperation frames, the Golden constellation
of Yang and Belfiore fails to exploit the long block length in improving the coding gain. By appealing to
the lattice coding framework, one can improve the coding gain while still using the same decoder. For
example, we can concatenate the inner Golden constellation with an outer trellis code whose constraint
length is allowed to increase with the block length. The improved coding gain of this approach translates
into enhanced frame error rates (as validated by the numerical results in Section 4.3). In our design, we
generate our trellis code as a systematic convolutional code (CC) over $\mathbb{Z}_Q$. Assuming that one codeword
spans $N$ cooperation frames, the received vector can now be expressed as

$$\mathbf{y} = \mathbf{H}\mathbf{G}'_{gc}\,\mathbf{x} + \mathbf{z} \tag{17}$$
$$\phantom{\mathbf{y}} = \mathbf{H}\mathbf{G}'_{gc}\,(\mathbf{G}_{cc}\mathbf{u} + \eta) + \mathbf{z} \tag{18}$$

where $\mathbf{x} \in \mathbb{R}^{2N}$ is the output of the CC, and $(\mathbf{H}\mathbf{G}'_{gc})$ is the effective channel seen by the CC. This effective
channel is obtained as follows: $\mathbf{H}$ is the real representation of the 2x2 MIMO channel in (16), and
$\mathbf{G}'_{gc} = \mathbf{I}_{N/2} \otimes \mathbf{G}_{gc}$, where $\mathbf{G}_{gc} \in \mathbb{R}^{8 \times 8}$ is the real representation of the Golden constellation generator matrix
in (17). The model in (18) is based on observing that the CC can be viewed as a construction A lattice
code with hypercubic shaping and a generator matrix $\mathbf{G}_{cc}$. Now, we multiply by the feedforward filter $\mathbf{F}$
of the MMSE-DFE for the effective channel $(\mathbf{H}\mathbf{G}'_{gc})$ and then use the Fano search algorithm on the composite
channel-code generator matrix $\mathbf{F}\mathbf{H}\mathbf{G}'_{gc}\mathbf{G}_{cc}$.

4.2 Decode and Forward (DF) Cooperation 

To the best of our knowledge, within the class of DF strategies, the dynamic decode and forward (DDF) 
achieves the best DMT [1] (the same protocol was independently discovered in different contexts [20, 
21]). This motivates developing low complexity variants of this strategy which are particularly suited 
for implementation. We take a step by step approach where a number of lemmas, that characterize the 
modifications needed for complexity reduction while maintaining a good performance, are derived. For 
the sake of completeness, we first describe the DDF protocol. 

In the DDF protocol the source transmits data, at a rate of R bits per channel use (BPCU), during every 
symbol-interval in the codeword. A codeword is defined as M consecutive sub-blocks, during which all 
channel gains remain fixed. Each sub-block is composed of T symbol-intervals. The relay listens to the 
source for enough sub-blocks until the mutual information between its received signal and source signal
exceeds $MTR$. It then decodes and re-encodes the message using an independent Gaussian code-book and
transmits it during the rest of the codeword. We denote the signals transmitted by the source and relay by
$\{x_k\}_{k=1}^{MT}$ and $\{\tilde{x}_k\}_{k=M'T+1}^{MT}$, respectively, where $M'$ is the number of sub-blocks that the relay waits before
starting transmission. Using this notation, the received signals (at the destination) can be written as

$$y_k = \begin{cases} g_1 x_k + v_k & \text{for } M'T \ge k \ge 1 \\ g_1 x_k + g_2 \tilde{x}_k + v_k & \text{for } MT \ge k > M'T \end{cases}$$

where $g_1$ and $g_2$ denote the source-destination and relay-destination channel gains, respectively. It is now
clear that the number of sub-blocks that the relay listens should be chosen according to

$$M' = \min\left\{ M, \left\lceil \frac{MR}{\log_2(1 + |h|^2 c\rho)} \right\rceil \right\} \tag{19}$$

where $h$ is the source-relay channel gain. In this expression, $c = \sigma_v^2/\sigma_w^2$ denotes the ratio of the destination
noise variance to that of the relay. The following result from [1] describes the DMT achievable by the
DDF protocol as $T \to \infty$ and $M \to \infty$.

Theorem 2 The diversity-multiplexing tradeoff achieved by the DDF protocol is given by

$$d(r) = \begin{cases} 2(1 - r) & \text{if } \frac{1}{2} \ge r \ge 0 \\ (1 - r)/r & \text{if } 1 \ge r \ge \frac{1}{2} \end{cases} \tag{20}$$
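The tradeoff curves of Theorems 1 and 2 are easy to compare numerically. The sketch below simply evaluates (15) and (20); the function names are ours.

```python
def dmt_naf(r):
    """DMT of the NAF relay protocol, equation (15)."""
    if not 0 <= r <= 1:
        raise ValueError("r must lie in [0, 1]")
    return (1 - r) + max(1 - 2 * r, 0)


def dmt_ddf(r):
    """DMT of the DDF relay protocol, equation (20)."""
    if not 0 <= r <= 1:
        raise ValueError("r must lie in [0, 1]")
    return 2 * (1 - r) if r <= 0.5 else (1 - r) / r
```

Both curves start at $d(0) = 2$, and the DDF curve is never below the NAF curve for any $0 \le r \le 1$, with a strict gap for $0 < r < 1$.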

It is evident from the protocol description that the achievability result in [1] relied on using independent 
Gaussian codebooks at the source and relay nodes. This approach will potentially require a computation- 
ally intensive algorithm at the destination to implement joint decoding for the source and relay signals. 
Allowing the relay node to start transmission at the beginning of every sub-block, based on the value of the 
instantaneous mutual information, is another potential source for complexity. In practice, this requires the 
source to use a very high-dimensional constellation (with a very low rate code) such that the information 
stream is uniquely decodable from one sub-block if the source-relay channel is good enough. This feature 
also impacts the amount of overhead in the relay-destination packet since the destination must be
informed of the starting time of relay transmission. Now, we introduce two simplifications of the original
DDF protocol that aim to lower the complexities associated with these two properties.

1. After successfully decoding, the relay can correctly anticipate the future transmissions from the
source (i.e., $x_k$ for $M'T + 1 \le k \le MT$) since it knows the source codebook. Based on this
knowledge, the relay implements the following scheme, i.e.

$$\tilde{x}_k = \begin{cases} x^*_{k+1} & \text{for } k = M'T + 1, M'T + 3, \cdots \\ -x^*_{k-1} & \text{for } k = M'T + 2, M'T + 4, \cdots \end{cases} \tag{21}$$

which reduces the signal seen by the destination for $M'T + 1 \le k \le MT$ to an Alamouti
constellation.

2. We allow the relay to transmit only after the codeword is halfway through, i.e., we replace the rule
in (19) with

$$M' = \min\left\{ M, \max\left\{ \frac{M}{2}, \left\lceil \frac{MR}{\log_2(1 + |h|^2 c\rho)} \right\rceil \right\} \right\} \tag{22}$$



Fortunately, these modifications do not entail any loss in performance (at least from the DMT perspec- 
tive) as formalized in the following lemma. 
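The waiting rules (19) and (22) can be sketched as follows. The function name, its arguments, and the rounding of $M/2$ up to an integer for odd $M$ are our own assumptions for illustration.

```python
import math


def relay_wait_subblocks(M, R, h_abs2, c, rho, halfway=False):
    """Number of sub-blocks M' the relay listens before transmitting.

    halfway=False follows rule (19); halfway=True follows rule (22).
    Rounding M/2 up for odd M is an assumption made for this sketch.
    """
    capacity = math.log2(1 + h_abs2 * c * rho)   # bits per channel use
    if capacity == 0:
        return M                                 # relay never decodes in time
    needed = math.ceil(M * R / capacity)
    if halfway:
        needed = max(math.ceil(M / 2), needed)
    return min(M, needed)
```

For instance, with `M = 8`, `R = 2`, `h_abs2 = 1`, `c = 1` and `rho = 255`, rule (19) gives 2 sub-blocks while rule (22) gives 4.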

Lemma 3 The modified (lower complexity) DDF protocol still achieves the same DMT as in Theorem 2.

Proof: To prove the first part of the lemma, let us denote the signals received at the destination by
$\{y_k\}_{k=1}^{MT}$. Then

$$y_k = \begin{cases} g_1 x_k + v_k & \text{for } k = 1, \cdots, M'T \\ g_1 x_k + g_2 x^*_{k+1} + v_k & \text{for } k = M'T + 1, M'T + 3, \cdots \\ g_1 x_k - g_2 x^*_{k-1} + v_k & \text{for } k = M'T + 2, M'T + 4, \cdots \end{cases}$$

Now, through linear processing of $\{y_k\}_{k=1}^{MT}$, the destination derives $\{\tilde{y}_k\}_{k=1}^{MT}$ such that

$$\tilde{y}_k = \begin{cases} g_1 x_k + \tilde{v}_k & \text{for } k = 1, \cdots, M'T \\ \sqrt{|g_1|^2 + |g_2|^2}\; x_k + \tilde{v}_k & \text{for } k = M'T + 1, \cdots, MT \end{cases} \tag{23}$$

with $\tilde{v}_k$ being statistically identical to $v_k$. Using (23), it is straightforward to see that the destination pairwise
error probability, averaged over the ensemble of Gaussian codes (used by the source) and conditioned on
a certain channel realization, is given by

$$P_{PE|g_1, g_2, h} \le \left(1 + \tfrac{1}{2}|g_1|^2 \rho\right)^{-M'T} \left(1 + \tfrac{1}{2}\left(|g_1|^2 + |g_2|^2\right)\rho\right)^{-(M - M')T}.$$

This last expression, though, is identical to the one corresponding to the original DDF protocol (refer
to [1]). This means that this implementation of the DDF strategy will achieve the same DMT as Theorem 2
as $T \to \infty$ and $M \to \infty$, which completes the proof of the first part.
To prove the second part, we notice that the effect of constraining the relay to start transmission only
after the codeword is halfway through, i.e., adopting the rule (22), is to replace equation (52) in [1] with

$$\inf_{O^+,\, f} u = \begin{cases} 0 & \text{if } f = \frac{1}{2} \\ \left(1 - \frac{r}{f}\right)^+ & \text{if } 1 \ge f > \frac{1}{2} \end{cases} \tag{24}$$

where $u$ denotes the exponential order of $1/|h|^2$. Now, since (24) is different from (52) in [1] only for
$f = 1/2$, for which $\inf(v_1 + v_2)$ is already equal to the optimal value $2(1 - r)$, we conclude that this
restriction does not affect the DMT achieved by the protocol. □

As shown in (23), the channel seen by the destination in the modified DDF protocol is a time-selective
SISO channel, which facilitates leveraging standard SISO decoding architectures (e.g., belief propagation, Fano
decoder) at the destination. In addition, restricting the relay to transmit only after $M' \ge M/2$ sub-blocks means
that the constellation size can be chosen such that the information stream is uniquely decodable only after
$M' = M/2$.
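The linear processing that turns a pair of received symbols of the modified protocol into two SISO observations is the standard Alamouti combiner. The sketch below assumes a noiseless pair $y_1 = g_1 x_1 + g_2 x_2^*$, $y_2 = g_1 x_2 - g_2 x_1^*$, matching the relay rule (21); the function name is ours.

```python
import math


def alamouti_combine(y1, y2, g1, g2):
    """Combine one Alamouti pair into two SISO observations.

    Returns (z1, z2) with z_i = sqrt(|g1|^2 + |g2|^2) * x_i + noise,
    assuming y1 = g1*x1 + g2*conj(x2) + v1 and y2 = g1*x2 - g2*conj(x1) + v2.
    """
    y1, y2, g1, g2 = complex(y1), complex(y2), complex(g1), complex(g2)
    norm = math.sqrt(abs(g1) ** 2 + abs(g2) ** 2)
    z1 = (g1.conjugate() * y1 - g2 * y2.conjugate()) / norm
    z2 = (g2 * y1.conjugate() + g1.conjugate() * y2) / norm
    return z1, z2
```

Since the combiner is a unitary transformation of the pair $(y_1, y_2^*)$, the noise statistics are preserved, which is why the combined noise in (23) is statistically identical to the original noise.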

The next result investigates the effect of allowing the relay to transmit only at a finite number of
instants. These instants partition the codeword into $N + 1$ segments which are not necessarily equal in
length. We assume that the $j$-th segment starts at the beginning of sub-block $N_j + 1$, and define the
$N$ waiting fractions $f_j$ such that $f_j = N_j / M$, with $f_0 = 0$ and $f_{N+1} = 1$. Thus

$$f_0 = 0 < f_1 < \cdots < f_N < f_{N+1} = 1.$$

The question now is how to choose $\{f_j\}_{j=1}^{N}$, for a finite $N$, such that the protocol achieves the optimal
DMT. The following lemma shows that this problem does not have a unique optimal solution and
characterizes a Pareto optimal set of fractions.

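Lemma 4 For the DDF protocol with a finite $N$,

1. there exists no uniformly dominant set of fractions $\{f_j\}_{j=1}^{N}$.

2. let $f_1^p = \frac{1}{2}$ and

$$f_j^p = \frac{1 - f_{j-1}^p}{2 - \left(1 + \frac{1}{f_N^p}\right) f_{j-1}^p}, \quad \text{for } N \ge j > 1 \tag{25}$$

then the set of fractions $\{f_j^p\}_{j=1}^{N}$ is Pareto optimal, with

$$d^p(r) = 1 - r + \left(1 - \frac{r}{f_N^p}\right)^+. \tag{26}$$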
In 
Proof: We note that the outage set for the DDF protocol with a general set of waiting fractions, $\{f_j\}_{j=1}^{N}$,
is still given by equation (49) in [1], i.e.

$$O^+ = \{(v_1, v_2, u) \in \mathbb{R}^{3+} \mid f(1 - v_1)^+ + (1 - f)(1 - \min\{v_1, v_2\})^+ < r\}.$$

The only difference is that $f$ is now given by (compare with (52) of [1])

$$f = \begin{cases} f_1 & \text{if } 1 - \frac{r}{f_1} > u \ge 0 \\ f_j & \text{if } 1 - \frac{r}{f_j} > u \ge \left(1 - \frac{r}{f_{j-1}}\right)^+ \\ 1 & \text{if } u \ge \left(1 - \frac{r}{f_N}\right)^+ \end{cases} \tag{27}$$

Next, we split $O^+$ such that

$$O^+ = \bigcup_{j=1}^{N+1} O_j^+, \quad \text{where } O_j^+ \triangleq \{(v_1, v_2, u) \in O^+ \mid f = f_j\}. \tag{28}$$

Now, from (27) and (28) we get

$$\inf_{O_j^+} u = \begin{cases} 0 & \text{for } j = 1 \\ \left(1 - \frac{r}{f_{j-1}}\right)^+ & \text{for } N + 1 \ge j \ge 2 \end{cases}$$

Also, since $f_j \ge \frac{1}{2}$, we have $\inf_{(v_1, v_2) \in O_j^+}(v_1 + v_2) = (1 - r)/f_j$ (refer to (55) in [1]). Thus

$$d_j(r) = \inf_{O_j^+}(v_1 + v_2 + u) = \frac{1 - r}{f_j} + \left(1 - \frac{r}{f_{j-1}}\right)^+. \tag{29}$$

But (28), along with (29), results in

$$d(r) = \min_{f_j \ge r} d_j(r),$$

$$d(r) = \min_{r \le f_j} \left[ \frac{1 - r}{f_j} + \left(1 - \frac{r}{f_{j-1}}\right)^+ \right]. \tag{30}$$

Now let us assume that the set of waiting fractions $\{f_j^u\}_{j=1}^{N}$ is uniformly optimal. Pick $\{f_j\}_{j=1}^{N}$ such
that $f_N^u < f_N < 1$. Then from (30) we conclude that for any $f_N^u < r < f_N$, $d^u(r) = 1 - r$ and
$d(r) = 1 - r + 1 - r/f_N$. Thus $d^u(r) < d(r)$, which is in contradiction with the uniform optimality
assumption of $\{f_j^u\}_{j=1}^{N}$. To prove the second part of the lemma, we observe that for $N \ge j > 1$, $\{f_j^p\}_{j=1}^{N}$
as given by (25) results in

$$d^p_{N+1}(r) < d^p_j(r), \quad \text{for } f_j^p \ge r \ge 0,\ r \ne f_{j-1}^p \tag{31}$$

and

$$d^p_{N+1}(f_{j-1}^p) = d^p_j(f_{j-1}^p). \tag{32}$$

Now (31), along with (32) and (29), proves that $\{f_j^p\}_{j=1}^{N}$ achieves (26). The only thing left is to show that
$d^p(r)$ is indeed Pareto optimal, i.e., no other set $\{f_j\}_{j=1}^{N}$ dominates $\{f_j^p\}_{j=1}^{N}$ in the Pareto sense. To do so,
we assume such a set exists and observe that since $f_0 = f_0^p = 0$ and $f_{N+1} = f_{N+1}^p = 1$, there should be
$N + 1 \ge i \ge 1$ and $N + 1 \ge \ell \ge 1$ such that

/i-i < < ft < fi, or (33) 
fi-i < f t _i < ft < U (34) 

Now, if (33) is true, then we observe from (30) that



Ji it 



where we have used (32) in deriving the last step. This, however, is in contradiction with the Pareto
dominance of $\{f_j\}$, since

d{fU)<d^flx)- 

On the other hand, if (34) is true, then

d(fi-x) < ^(/i-i) = < 2 - (1 + = ^(/i-O, 

Ji J" AT 

or 

< d"(/i-i), 

which again is in contradiction with Pareto dominance of {fj}. This completes the proof of the second 
part. □ 

Figure 1 shows the DMT for Pareto optimal DDF protocols with $N = 1$ ($\{\frac{1}{2}\}$), $N = 2$ ($\{\frac{1}{2}, \frac{2}{3}\}$) and
$N = \infty$.
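The Pareto optimal fractions of Lemma 4 can be computed numerically. Since $f_N^p$ appears on both sides of (25), the sketch below finds it by bisection (our choice of solver, not part of the lemma) and then evaluates the tradeoff in (30) for an arbitrary set of fractions. Function names are illustrative.

```python
def pareto_fractions(n, iters=200):
    """Waiting fractions f_1..f_n satisfying (25) with f_1 = 1/2.

    Bisection on the unknown f_n is an implementation choice made here.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if n == 1:
        return [0.5]

    def run(f_last):
        # Iterate (25) for a trial value of f_n; None marks an invalid trial.
        s = 1.0 + 1.0 / f_last
        fr = [0.5]
        for _ in range(n - 1):
            den = 2.0 - s * fr[-1]
            if den <= 0.0:
                return None
            nxt = (1.0 - fr[-1]) / den
            if nxt >= 1.0:
                return None
            fr.append(nxt)
        return fr

    lo, hi = 0.5, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        fr = run(mid)
        if fr is None or fr[-1] > mid:
            lo = mid          # trial f_n too small
        else:
            hi = mid          # trial f_n large enough
    return run(hi)


def ddf_dmt(r, fractions):
    """Evaluate (30) for waiting fractions f_1 < ... < f_N and 0 <= r <= 1."""
    f = [0.0] + list(fractions) + [1.0]
    best = None
    for j in range(1, len(f)):
        if f[j] < r:
            continue
        extra = 0.0 if f[j - 1] == 0.0 else max(1.0 - r / f[j - 1], 0.0)
        d_j = (1.0 - r) / f[j] + extra
        best = d_j if best is None else min(best, d_j)
    return best
```

For $N = 2$ this recovers the fractions $\{\frac{1}{2}, \frac{2}{3}\}$, and `ddf_dmt` evaluated on the resulting sets agrees with the closed form (26).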
4.3 Numerical Results

Throughout this section, we consider construction-A lattice codes obtained from systematic convolutional
codes (CC) over $\mathbb{Z}_Q$ with $Q$ a prime number. Unless otherwise mentioned, we choose the SNR level of
the inter-user channel to be 3 dB higher than the SNR at the destination. In all scenarios, the MMSE-DFE
Fano decoder uses a bias $b = 1.2$ and a step-size $\Delta = 5$ [22].

First, we demonstrate the performance improvement obtained by augmenting the Golden constellation 
of [10] with a lattice code obtained from a rate 1/2 systematic CC. In both schemes, the frame length is 
128. We assume that the source and the relay transmit with equal power, and that the source transmits with 
same power at all instants. Figure 2 shows the frame error rate (FER) of the two coding schemes as a
function of the average received SNR at the destination. For the Golden constellation, 4-QAM, 16-QAM and
64-QAM modulations for the information symbols lead to transmission rates of 2, 4 and 6 BPCU,
respectively. For comparison, we use primes $Q = 5$, 17 and 67, respectively, to achieve the corresponding transmission
rates (slightly higher though). From the figure, we see that augmenting the Golden constellation with the
CC provides a performance improvement of about 1 to 1.5 dB. The inferior performance of the augmented
code for 2 BPCU is due to the fact that the effective transmission rate of the code, i.e., $\log_2(5) = 2.32$,
is significantly higher than 2 BPCU offered by Golden constellation. This choice is dictated by the need to 
choose Q as a prime so we have a nice lattice representation for the received signal which is instrumental 
for the MMSE-DFE Fano decoder. 

Next, we proceed to the modified DDF protocol proposed in Section 4.2. At the source, the information
stream is appended with 16 CRC bits, and the resulting vector is encoded using a rate 1/4 systematic CC.
The relay attempts decoding after waiting $N_i$ sub-blocks, where $N_i$ is the smallest among the set of allowed
waiting times such that the mutual information at the relay exceeds the received rate. In order to avoid
error propagation at the destination, the decoded vector is checked for validity using the CRC bits. If the
decoded vector is valid, i.e., it satisfies the CRC check, then the relay uses the modified DDF protocol. If
the decoded stream at the relay does not satisfy the CRC check, then the relay attempts decoding again at
the next allowed waiting time ($N_{i+1}$), and so on. At the receiver we use the MMSE-DFE Fano decoder
(note that the MMSE-DFE part is now a trivial scaling since the channel seen by the destination is SISO). 
We also assume that the destination knows the time at which the relay starts transmitting (via overhead 
bits). For the range of transmission rates considered in the sequel, it turns out that increasing the number 
of segments beyond 3 provides a negligible increase in performance. Figure 3 shows the outage probability
of the DDF relay protocol when the codeword is partitioned into 3, 4, 5 and 6 segments. As seen from the figure,
the gap between the outage curves is negligible. Therefore, in the sequel, we consider only the variant of
the DDF relay protocol with 3 segments. Moreover, we choose the waiting fractions $f_j$ according to Lemma 4,
i.e., the Pareto-optimal set of waiting fractions for 3 segments, given by $\{\frac{1}{2}, \frac{2}{3}\}$. Figure 4 compares the
performance of our variant of the DDF protocol with that of the NAF protocol for 2 and 3 BPCU. The
proposed DDF strategy offers a gain of about 4 dB (and about 6 dB) over the NAF scheme, for 2 (and
3) BPCU. Note, however, that the DDF protocol entails additional complexity at the relay, as compared with the
NAF protocol, since the relay needs to decode the information stream. Finally, we note that although 
several implementations of Amplify-and-Forward and Decode-and-Forward variants exist ( [5], [7], [6]), 
all these works consider low transmission rates. However, the impact of superior DMTs of our schemes 
does lead to significantly better performance at low transmission rates. It should be noted, however, that 
the difference in performance between the protocols that are spectrally efficient and those that are not 
manifests itself only at high transmission rate scenarios. Therefore, we focus on high transmission rates 
in this paper. 



5 The Cooperative Multiple Access (CMA) Channel 

In this section, we implement the CMA-NAF protocol for two users. We start by describing the protocol.
In the CMA-NAF protocol, each of the two sources transmits once per cooperation frame, where a
cooperation frame is defined by two consecutive symbol-intervals. Each source, when active, transmits
a linear combination of the symbol it intends to send and the (noisy) signal it received from its partner
during the last symbol-interval. For source j and frame k, we denote the broadcast and repetition gains
by $a_j$ and $b_j$, respectively, the symbol to be sent by $x_{j,k}$, and the transmitted signal by $t_{j,k}$. At startup the
transmitted signals will take the form

$$t_{1,1} = a_1 x_{1,1} \tag{35}$$
$$t_{2,1} = a_2 x_{2,1} + b_2 (h\, t_{1,1} + w_{2,1}) \tag{36}$$
$$t_{1,2} = a_1 x_{1,2} + b_1 (h\, t_{2,1} + w_{1,1}) \tag{37}$$
$$t_{2,2} = a_2 x_{2,2} + b_2 (h\, t_{1,2} + w_{2,2}) \tag{38}$$

where $h$ denotes the inter-source channel gain and $w_{j,k}$ the noise observed by source j during frame k.
(We assume that $w_{j,k}$ has variance $\sigma_w^2$.) The corresponding signals received by the destination are



$$y_{1,1} = g_1 t_{1,1} + v_{1,1} \tag{39}$$
$$y_{2,1} = g_2 t_{2,1} + v_{2,1} \tag{40}$$
$$y_{1,2} = g_1 t_{1,2} + v_{1,2} \tag{41}$$
$$y_{2,2} = g_2 t_{2,2} + v_{2,2} \tag{42}$$



where $g_j$ is the gain of the channel connecting source j to the destination and $v_{j,k}$ the destination noise
of variance $\sigma_v^2$. Note that, as mandated by our half-duplex constraint, no source transmits and receives
simultaneously. The broadcast and repetition gains $\{a_j, b_j\}$ are (experimentally) chosen to minimize outage
probability at the destination. As a consequence of symmetry, $a_1$ and $a_2$, as well as $b_1$ and $b_2$, will have
the same optimal value. Thus, we assume that broadcast and repetition gains are the same at each source
and omit the subscripts, yielding $\{a, b\}$. We also define a codeword as N consecutive symbol-intervals
(assuming that N is even, this means that there are N/2 cooperation frames in each codeword). The
following result from [1] gives the diversity-multiplexing tradeoff achieved by this protocol.

Theorem 5 The CMA-NAF protocol achieves the optimal diversity-multiplexing tradeoff of the channel, 
i.e., for two users 



$$d(r) = 2(1 - r). \tag{43}$$

From (35)-(42), the received vector at the destination can be written as

$$\mathbf{y}_1 = \mathbf{H}_1 \mathbf{x}_1 + \mathbf{B}\mathbf{w} + \mathbf{v}, \tag{44}$$



where $\mathbf{y}_1 = [y_{2,N/2}\; y_{1,N/2}\; \cdots\; y_{2,1}\; y_{1,1}]^T \in \mathbb{C}^{N}$ is the vector observed at the destination,
$\mathbf{x}_1 = [x_{2,N/2}\; x_{1,N/2}\; \cdots\; x_{2,1}\; x_{1,1}]^T$ is the vector formed by multiplexing the codewords transmitted
by the two sources, and $\mathbf{w} \in \mathbb{C}^{N-1}$ and $\mathbf{v} \in \mathbb{C}^{N}$ are the AWGN vectors at the sources and the
destination, respectively. The effective channel matrix $\mathbf{H}_1 \in \mathbb{C}^{N \times N}$ is the upper triangular matrix

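$$\mathbf{H}_1 = a\,\mathbf{D}_g \begin{bmatrix} 1 & (bh) & \cdots & (bh)^{N-1} \\ 0 & 1 & \cdots & (bh)^{N-2} \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix},$$

where $\mathbf{D}_g = \mathrm{diag}(g_2, g_1, g_2, \ldots, g_2, g_1)$. Similarly, $\mathbf{B} \in \mathbb{C}^{N \times (N-1)}$ is given by

$$\mathbf{B} = b\,\mathbf{D}_g \begin{bmatrix} 1 & (bh) & \cdots & (bh)^{N-2} \\ 0 & 1 & \cdots & (bh)^{N-3} \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \\ 0 & 0 & \cdots & 0 \end{bmatrix}.$$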
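As a sanity check on the matrix form (44), the following minimal NumPy sketch runs the recursion (35)-(42) symbol by symbol and compares the result with the stacked expression. It assumes equal gains $\{a, b\}$ at both sources, the time-reversed stacking of (44), and that the entry of the triangular factor in row i and column j is $(bh)^{j-i}$; the function names and all numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np

def cma_naf_matrices(N, a, b, h, g1, g2):
    """Build H1 (N x N) and B (N x (N-1)) for the stacking [y_N, ..., y_1]."""
    # Destination gains, newest symbol first: g2, g1, ..., g2, g1 (N even).
    Dg = np.diag(np.array([g2, g1] * (N // 2), dtype=complex))
    # U[i, j] = (bh)^(j - i) on and above the diagonal, zero below it.
    idx = np.arange(N)
    expo = idx[None, :] - idx[:, None]
    U = np.where(expo >= 0, np.power(b * h, np.maximum(expo, 0)), 0.0)
    # B uses the first N-1 columns of U, so its last row is all zeros.
    return a * Dg @ U, b * Dg @ U[:, : N - 1]

def simulate_cma_naf(a, b, h, g1, g2, x, w, v):
    """Run the protocol in time order; x and v have length N, w has length N-1."""
    N = len(x)
    t = np.zeros(N, dtype=complex)
    y = np.zeros(N, dtype=complex)
    for n in range(N):
        t[n] = a * x[n]
        if n > 0:
            # Forward the noisy observation of the partner's previous signal.
            t[n] += b * (h * t[n - 1] + w[n - 1])
        g = g1 if n % 2 == 0 else g2  # source 1 is active first, then they alternate
        y[n] = g * t[n] + v[n]
    return y

rng = np.random.default_rng(0)
N, a, b = 8, 0.8, 0.5                              # hypothetical length and gains
h, g1, g2 = 0.7 - 0.2j, 1.1 + 0.3j, -0.4 + 0.9j    # hypothetical channel gains
x = rng.integers(-2, 3, N) + 1j * rng.integers(-2, 3, N)
w = 0.3 * (rng.standard_normal(N - 1) + 1j * rng.standard_normal(N - 1))
v = rng.standard_normal(N) + 1j * rng.standard_normal(N)

H1, B = cma_naf_matrices(N, a, b, h, g1, g2)
y_sim = simulate_cma_naf(a, b, h, g1, g2, x, w, v)
# Reverse the time order to match the stacking used in (44).
y_mat = H1 @ x[::-1] + B @ w[::-1] + v[::-1]
print(np.allclose(y_mat, y_sim[::-1]))
```

The last row of B is zero because the very first transmitted signal carries no relayed noise.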



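The noise $(\mathbf{B}\mathbf{w} + \mathbf{v})$ in (44) is colored, and must be whitened before tree search with the MMSE-DFE
Fano decoder can be performed. Let $\Sigma = \sigma_w^2 \mathbf{B}\mathbf{B}^H + \sigma_v^2 \mathbf{I}$ be the covariance matrix of $(\mathbf{B}\mathbf{w} + \mathbf{v})$. Then,

$$\mathbf{y}_c = \mathbf{H}_c \mathbf{x}_1 + \Sigma^{-1/2}(\mathbf{B}\mathbf{w} + \mathbf{v}), \tag{45}$$

where $\mathbf{y}_c = \Sigma^{-1/2}\mathbf{y}_1$, $\mathbf{H}_c = \Sigma^{-1/2}\mathbf{H}_1$ and the noise vector $\mathbf{z}_c = \Sigma^{-1/2}(\mathbf{B}\mathbf{w} + \mathbf{v})$ now consists of i.i.d. Gaussian
components with variance 1. $\Sigma^{-1/2}$ may be computed by any of the standard methods, e.g., singular value
decomposition, Cholesky decomposition or QR decomposition. The input-output relation in (45) can now
be written in the standard form (10) by separating the real and imaginary parts, to obtain the real 2N x 2N
system:

$$\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{z} \tag{46}$$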
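The whitening and real-valued expansion steps can be sketched as follows. This is a minimal illustration, not the implementation used for the reported results: random matrices stand in for $\mathbf{H}_1$ and $\mathbf{B}$, the noise variances are hypothetical, and the inverse Cholesky factor is used as the whitening filter (one valid choice among the methods listed above, although it is not the symmetric square root).

```python
import numpy as np

rng = np.random.default_rng(1)
N = 6
sigma_w2, sigma_v2 = 0.3, 1.0  # hypothetical noise variances

def crandn(*shape):
    """Circularly symmetric complex Gaussian samples with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

H1 = np.triu(crandn(N, N))  # stand-in for the upper triangular channel matrix
B = crandn(N, N - 1)        # stand-in for the noise-forwarding matrix

# Covariance of the colored noise B w + v.
Sigma = sigma_w2 * B @ B.conj().T + sigma_v2 * np.eye(N)

# Sigma = L L^H with L lower triangular; W = L^{-1} whitens the noise.
L = np.linalg.cholesky(Sigma)
W = np.linalg.inv(L)
Hc = W @ H1
cov_whitened = W @ Sigma @ W.conj().T  # identity up to rounding

def real_expand_matrix(A):
    """Map a complex matrix to its real representation [[Re, -Im], [Im, Re]]."""
    return np.block([[A.real, -A.imag], [A.imag, A.real]])

def real_expand_vector(u):
    """Stack real parts on top of imaginary parts."""
    return np.concatenate([u.real, u.imag])

# One noisy observation, whitened and then split into real and imaginary parts.
x = crandn(N)
noise = B @ (np.sqrt(sigma_w2) * crandn(N - 1)) + np.sqrt(sigma_v2) * crandn(N)
z = W @ noise
yc = Hc @ x + z

H = real_expand_matrix(Hc)  # real 2N x 2N channel matrix
y_real = real_expand_vector(yc)
residual = y_real - (H @ real_expand_vector(x) + real_expand_vector(z))
print(np.allclose(cov_whitened, np.eye(N)), np.allclose(residual, 0))
```

Any matrix W with $W \Sigma W^H = I$ whitens the noise; different choices differ only by a unitary rotation, which leaves the noise statistics unchanged.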

We construct the generator matrix $\mathbf{G}$ of the two sources combined, as seen by the decoder, by appropriately
multiplexing the rows and columns of the lattice generators of the two sources, $\mathbf{G}_1$ and $\mathbf{G}_2$. Fano decoding
can now be done over the resulting joint channel-code lattice, after MMSE-DFE preprocessing of the
channel matrix $\mathbf{H}$. Finally, the decoded codewords of the two sources are obtained by demultiplexing the
decoded lattice point.
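One way to picture the multiplexing of the two generators is sketched below. It is a simplified illustration under stated assumptions: the lattices are real, each source contributes n coordinates, the joint codeword alternates between source 2 and source 1 as in the stacking of (44), and the separation into real and imaginary parts is ignored. The permutation construction, the function names and the small integer generators are hypothetical.

```python
import numpy as np

def interleave_permutation(n):
    """P maps the stacked vector [s2; s1] to [s2[0], s1[0], s2[1], s1[1], ...]."""
    P = np.zeros((2 * n, 2 * n), dtype=int)
    for k in range(n):
        P[2 * k, k] = 1          # coordinates of source 2 go to even positions
        P[2 * k + 1, n + k] = 1  # coordinates of source 1 go to odd positions
    return P

def joint_generator(G1, G2):
    """Interleave rows and columns of blockdiag(G2, G1)."""
    n = G1.shape[0]
    Z = np.zeros((n, n), dtype=int)
    P = interleave_permutation(n)
    return P @ np.block([[G2, Z], [Z, G1]]) @ P.T, P

# Hypothetical integer generators and information vectors for the two sources.
G1 = np.array([[1, 0, 0], [1, 2, 0], [0, 1, 2]])
G2 = np.array([[2, 0, 0], [1, 1, 0], [1, 0, 3]])
u1 = np.array([1, -2, 0])
u2 = np.array([3, 1, -1])

G, P = joint_generator(G1, G2)
u = P @ np.concatenate([u2, u1])  # interleaved information vector
x = G @ u                         # joint lattice point seen by the decoder

# Demultiplexing recovers the individual codewords of the two sources.
x2_hat, x1_hat = x[0::2], x[1::2]
print(np.array_equal(x1_hat, G1 @ u1), np.array_equal(x2_hat, G2 @ u2))
```

Because P is a permutation matrix, the joint lattice is simply a reordering of the direct sum of the two source lattices, so demultiplexing amounts to reading alternate coordinates.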



5.1 Numerical Results 

In this section, we compare the performance of the lattice coded CMA-NAF scheme with that of other
schemes. The frame length is chosen to be N = 128. Each user encodes its information stream using a
construction-A lattice code obtained from a systematic convolutional code over $\mathbb{Z}_Q$. At the destination,
MMSE-DFE Fano decoding is performed over the joint code-channel lattice of both users. We define
the frame error event as $\{(\hat{\mathbf{x}}_1, \hat{\mathbf{x}}_2) \neq (\mathbf{x}_1, \mathbf{x}_2)\}$, where $\hat{\mathbf{x}}_1$ and $\hat{\mathbf{x}}_2$ refer to the decoded codewords. Figure 5
shows the frame error rate of the lattice coded CMA-NAF protocol vs. the case in which the two sources cooperate
according to the NAF relay protocol. We show the performance for 2 and 4 BPCU, and the coding scheme
achieving the better performance (Golden constellation alone or Golden constellation+CC) is chosen for the
NAF relay protocol. Figure 5 also shows the frame error rate of the CMA-NAF protocol when both
sources use uncoded QAM constellations for transmission. As in Figure 1, the worse performance of
lattice coded CMA-NAF relative to uncoded QAM can be explained by the higher transmission rate of the
lattice code (with Q = 5). Both coded and uncoded transmission with the CMA-NAF protocol perform
significantly better than the NAF relay protocol. The performance gap between the two schemes widens
as the transmission rate increases, which can be explained by the superior DMT of the CMA-NAF protocol
over the NAF relay protocol. Figure 6 shows the bit error rate performance of uncoded QAM transmission
with the CMA-NAF protocol vs. the NAF relay protocol with the Golden constellation. Again, the CMA-NAF
protocol with uncoded transmission shows a bit error rate improvement of about 3 dB at 2 BPCU, and about
5 dB at 4 BPCU. Finally, Figures 7 and 8 show the performance and complexity tradeoff of the MMSE-DFE
Fano decoder as a function of the bias for the CMA-NAF protocol with CC lattice coded transmission (for N = 64).
We see from the figures that the complexity of the Fano decoder can be reduced at the expense of performance.
In our simulations, we use a bias of b = 1.2 to achieve good performance with reasonable complexity.

6 Conclusions 

We have developed a lattice theoretic framework for cooperative coding in half-duplex outage-limited 
channels. This framework achieves the cooperative diversity gains promised in [1] and enjoys realizable
encoding and decoding complexity. In the relay channel, the proposed scheme achieves the optimal
diversity-multiplexing tradeoff for r < 0.5 while reducing the channel between the source and relay, on
one side, and the destination, on the other side, to a time-selective SISO channel. This simplification
allows for using traditional receiver architectures in this scenario. A tree coding approach that achieves
the optimal diversity-multiplexing tradeoff for the cooperative multiple access channel is developed. In this
context, the MMSE-DFE Fano decoder is shown to yield an excellent performance-vs-complexity tradeoff.
The significant performance gains offered by the proposed schemes are validated via a comprehensive
simulation study.

Figure 1: Performance of NAF relay with Golden constellation, with the CC outer code (denoted by
GC+CC) and without any outer code (denoted by GC).

Figure 2: Pareto optimal diversity-multiplexing tradeoff for the DDF protocol with N = 1, 2 and ∞.

Figure 3: Outage performance of DDF relay protocol with 3, 4, 5 and 6 segments.

Figure 4: Performance of DDF relay protocol (with 3 segments) vs. NAF relay protocol.

Figure 6: BER performance of CMA-NAF protocol with uncoded QAM vs. Relay NAF protocol with
Golden constellation.

Figure 7: Performance and Complexity of MMSE-DFE Fano decoder with CMA-NAF protocol, for R = 2
BPCU.

Figure 8: Performance and Complexity of MMSE-DFE Fano decoder with CMA-NAF protocol, for R = 4
BPCU.